Fusion of Multiple Features and Ranking SVM for Web-based English-Chinese OOV Term Translation

Authors

  • Yuejie Zhang
  • Yang Wang
  • Lei Cen
  • Yanxia Su
  • Cheng Jin
  • Xiangyang Xue
  • Jianping Fan
Abstract

This paper focuses on Web-based English-Chinese OOV term translation, with particular emphasis on a translation selection strategy based on the fusion of multiple features and a ranking mechanism based on the Ranking Support Vector Machine (Ranking SVM). Experiments on different data sources, using the CoNLL-2003 corpus for the English Named Entity Recognition (NER) task together with selected new terms, show consistent results. Our OOV term translation model is better able to “filter” out the most likely translation candidates. Experimental results from combining our OOV term translation model with English-Chinese Cross-Language Information Retrieval (CLIR) on data sets from the Text REtrieval Conference (TREC) show that a clear performance improvement can also be obtained for both query translation and retrieval.

Similar resources

Improved Cross-language Information Retrieval via Disambiguation and Vocabulary Discovery

Cross-lingual information retrieval (CLIR) allows people to find documents irrespective of the language used in the query or document. This thesis is concerned with the development of techniques to improve the effectiveness of Chinese–English CLIR. In Chinese–English CLIR, the accuracy of dictionary-based query translation is limited by two major factors: translation ambiguity and the presence ...

English-Chinese Bi-Directional OOV Translation based on Web Mining and Supervised Learning

In Cross-Language Information Retrieval (CLIR), Out-of-Vocabulary (OOV) detection and translation pair relevance evaluation remain key problems. In this paper, an English-Chinese Bi-Directional OOV translation model is presented, which uses Web mining as the corpus source to collect translation pairs and combines it with supervised learning to evaluate their degree of association. The experim...

Web based English-Chinese OOV term translation using Adaptive rules and Recursive feature selection

Cross-Language Information Retrieval (CLIR) systems use dictionaries for information retrieval. However, out-of-vocabulary (OOV) terms cannot be found in dictionaries. Although many researchers have endeavored to solve the OOV term translation problem, little attention has been paid to hybrid translations such as “α1-antitrypsin deficiency (α1-抗胰蛋白酶缺乏症)”. This paper presents a novel OOV ...

RMIT Chinese-English CLIR at NTCIR-4

We participated in the Chinese-English CLIR task, concentrating primarily on the issues of translation disambiguation and automatic translation extraction of OOV terms. A new technique to identify and translate Chinese OOV terms using the web was developed. The results for this aspect of our work appear promising.

Fusion of Multiple Features and Supervised Learning for Chinese OOV Term Detection and POS Guessing

In this paper, to support more precise Chinese Out-of-Vocabulary (OOV) term detection and Part-of-Speech (POS) guessing, a unified mechanism is proposed and formulated based on the fusion of multiple features and supervised learning. Besides all the traditional features, the new features for statistical information and global contexts are introduced, as well as some constraints and heuristic ru...

Publication date: 2010